ABSTRACT


Introduction: Etiological models use the same estimation procedures as most predictive modelling (i.e., regression) to quantify
the relative risk associated with a selected exposure on an outcome. Though regression is used for both purposes, the way
in which the model is built differs according to the goals of the model. The goal of a prediction model differs in several
important ways.


Methods: Using a cohort design that links baseline risk factors to a validated population-based diabetes registry, a model
(Diabetes Population Risk Tool, or DPoRT) to predict diabetes risk using commonly collected national survey data was
developed and validated. The development cohort was the National Population Health Survey (NPHS) linked to the validated
Diabetes Database, a provincial component of the National Diabetes Surveillance System (NDSS). Variables were restricted to
factors routinely measured in the population. The probability of developing diabetes was modelled using sex-specific survival
functions for those > 20 years of age, without diabetes and not pregnant at baseline (N = 19,000).


Results: The age-standardized 5-year incidence rates in the development cohorts were 6.52% for males and 5.42% for females.
The 3-year age-standardized incidence rates in the development cohort were 3.42% for males and 2.41% for females.
The age-standardized 5-year incidence rates in the development cohorts were 6.42% for males and 4.20% for females. The
age-standardized 3-year incidence rates for the validation cohort were 3.45% for males and 3.22% for females.


Conclusion: Determinants of weight and weight change are essential when developing strategies to prevent or reduce the
future diabetes burden. In monitoring trends over time, researchers are often faced with the dilemma of separating trends
between individuals from trends within individuals. Multilevel growth models allow both aspects to be modelled, which
strengthens the ability to model trends that vary between and within individuals.


Key Words: Diabetes Burden, Framingham Heart Score, Multilevel Growth Models, National Population Health Survey,
Prediction Models


INTRODUCTION 


In many scientific disciplines, studies that predict or forecast
what will happen in the future have contributed to our
understanding of the world. Scientific studies that provide
models to inform strategies that may modify, and possibly
mitigate, future events are of value to society. Examples include
estimating the impact of climate or environmental changes on the
earth's ecosystems, or the impact of policy changes on the
economy. These prediction models are accepted as valuable tools
by scientists and have provided critical information for the
development of strategies to change predicted trends. Within the
field of epidemiology, prediction models are underrepresented,
and the concept of risk prediction is overshadowed by the
estimation of relative risk measures to clarify etiological
perspectives of disease. Etiological models use the same
estimation procedures as most predictive modelling (i.e.,
regression) to quantify the relative risk associated with a
selected exposure on an outcome. Though regression is used for
both purposes, the way in which the model is built differs
according to the goals of the model. The goal of a prediction
model differs in several important ways. First, the result to be
optimized is an absolute measure of risk, often expressed as a
percentage or probability (versus a risk or hazard ratio).
Second, the goal of a prediction model is to maximise the ability
to discriminate between at-risk groups and to correctly classify
true risk, referred to as discrimination and calibration.
Typically, these indices are not evaluated in etiological models.
Third, prediction models must be generalizable to other
populations to which the model is applied. Typically, etiological
models fit the data used to generate the relative risk estimate
as tightly as possible and, as a result, might not be
reproducible using data from other settings or might not be
applicable to explaining risk in another population. These goals
change the criteria for model assessment and, concomitantly, the
methodological framework that is employed.


In medicine, prediction models, in the form of risk algorithms,
are used as tools for patient decision-making. A risk algorithm
is a tool used to estimate the absolute risk of an outcome for an
individual as a function of their baseline characteristics.
Typically, the risk is expressed as the probability of dying or
developing a disease in a given time period. One of the most
widely used risk tools is the Framingham Heart Score. This tool
is used to calculate the probability that a patient will develop
coronary heart disease in 5 or 10 years and has been widely
integrated into disease prevention and management throughout the
world. Risk algorithms are widely recommended by medical
societies for appropriate identification of patients who may
benefit from specific interventions. This is exemplified in
clinical guidelines for pharmacologic interventions such as
cholesterol-lowering medications. Several potential benefits may
also be realized by extending the application of these tools to
the population level. As at the individual level, in the
population setting predictive risk tools have the potential to
provide insight into the future burden of a disease in an entire
region or nation and the influence of specific risk factors.
These tools can support health care decision-making, including
the effective and efficient allocation and distribution of health
care resources and planning for effective disease prevention
interventions. To date, prediction tools specifically designed
for use at the population level have been neither created nor
used for planning.
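
As an illustration of how individual risk estimates might be carried to
the population level, the sketch below sums predicted risks over survey
respondents, each multiplied by a survey weight, to obtain an expected
number of new cases. The Respondent structure, the function names and all
numbers are hypothetical and are not taken from this study.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    """One survey respondent (hypothetical structure for illustration)."""
    predicted_risk: float  # absolute risk over the prediction horizon, 0..1
    survey_weight: float   # number of people this respondent represents


def expected_new_cases(respondents):
    """Weighted sum of individual risks: expected number of new cases."""
    return sum(r.predicted_risk * r.survey_weight for r in respondents)


def population_risk(respondents):
    """Weighted mean risk across the population the sample represents."""
    total_weight = sum(r.survey_weight for r in respondents)
    if total_weight <= 0:
        raise ValueError("survey weights must sum to a positive number")
    return expected_new_cases(respondents) / total_weight


# Made-up example: two respondents standing in for 2,000 people.
sample = [
    Respondent(predicted_risk=0.02, survey_weight=1500.0),
    Respondent(predicted_risk=0.10, survey_weight=500.0),
]
print(expected_new_cases(sample))
print(population_risk(sample))
```

Restricting the sum to a subgroup (for example, one BMI category) would
give the share of the expected burden attributable to that subgroup.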


MATERIALS AND METHODS 


Using a cohort design that links baseline risk factors to a
validated population-based diabetes registry, a model (Diabetes
Population Risk Tool, or DPoRT) to predict diabetes risk using
commonly collected national survey data was developed and
validated. The development cohort was the National Population
Health Survey (NPHS) linked to the validated Diabetes Database, a
provincial component of the National Diabetes Surveillance System
(NDSS). Variables were restricted to factors routinely measured
in the population. The probability of developing diabetes was
modelled using sex-specific survival functions for those > 20
years of age, without diabetes and not pregnant at baseline
(N = 19,000).
[Ethical clearance no. BRLSABV7/07/2019]


The model was validated in two external validation cohorts, both
linked to administrative data for NDSS-defined
physician-diagnosed diabetes. Predictive accuracy was assessed by
comparing observed physician-diagnosed diabetes rates with
predicted risk estimates from DPoRT. Discrimination of the model
was assessed using the C statistic, and calibration was assessed
with the chi-square statistic.
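
The two performance measures named above can be sketched as follows. This
is a simplified illustration for a binary outcome (diabetes diagnosed or
not within the follow-up window) that ignores censoring; it is not the
study's own code. The grouping into equal-sized risk groups follows the
usual Hosmer-Lemeshow construction, which is an assumption here, since the
text states only that a chi-square statistic was used.

```python
def c_statistic(risks, outcomes):
    """Share of (case, non-case) pairs in which the case has the higher
    predicted risk; tied pairs count as one half."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for risk_case in cases:
        for risk_control in controls:
            if risk_case > risk_control:
                concordant += 1.0
            elif risk_case == risk_control:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def calibration_chi_square(risks, outcomes, groups=10):
    """Hosmer-Lemeshow-style statistic: sum over risk-ordered groups of
    (observed - expected)^2 / (n * mean_risk * (1 - mean_risk))."""
    ranked = sorted(zip(risks, outcomes))
    n = len(ranked)
    groups = min(groups, n)
    chi2 = 0.0
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(r for r, _ in chunk)   # predicted number of cases
        observed = sum(y for _, y in chunk)   # observed number of cases
        mean_risk = expected / size
        variance = size * mean_risk * (1.0 - mean_risk)
        if variance > 0:
            chi2 += (observed - expected) ** 2 / variance
    return chi2
```

A C statistic of 0.5 corresponds to no discrimination and 1.0 to perfect
separation of cases from non-cases; a small chi-square value indicates
that predicted and observed counts agree closely within risk groups.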


RESULTS 


In the development cohort, 700 males and 665 females developed
physician-diagnosed diabetes within the 5-year follow-up period.
The age-standardized 5-year incidence rates in the development
cohorts were 6.52% for males and 5.42% for females. The 3-year
age-standardized incidence rates in the development cohort were
3.42% for males and 2.41% for females. The age-standardized
5-year incidence rates in the development cohorts were 6.42% for
males and 4.20% for females. The age-standardized 3-year
incidence rates for the validation cohort were 3.45% for males
and 3.22% for females. All baseline population characteristics in
the derivation cohort and the two validation cohorts are shown in
Table 1. Both validation cohorts differed from the derivation
cohort. They were similar in age distribution; however, both had
a higher proportion of obese individuals. Compared to the
derivation cohort, one validation cohort had a higher baseline
prevalence of hypertension and heart disease but a lower
prevalence of smoking, while the other cohort had higher levels
of hypertension and heart disease compared to the derivation
cohort in women only.
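
For readers unfamiliar with age-standardized rates such as those reported
above, the following sketch shows direct standardization: age-specific
rates are averaged using the age distribution of a standard population as
weights. The age groups, counts and weights are invented for illustration;
the text does not state which standard population was used.

```python
def age_standardized_rate(cases, population, standard_weights):
    """Direct standardization: weighted average of age-specific rates.

    All three arguments are dicts keyed by the same age-group labels.
    """
    if not (cases.keys() == population.keys() == standard_weights.keys()):
        raise ValueError("all inputs must share the same age groups")
    total_weight = sum(standard_weights.values())
    rate = 0.0
    for group in cases:
        age_specific_rate = cases[group] / population[group]
        rate += age_specific_rate * standard_weights[group] / total_weight
    return rate


# Hypothetical cohort counts and standard-population weights.
cases = {"20-44": 10, "45-64": 40, "65+": 30}
population = {"20-44": 1000, "45-64": 1000, "65+": 500}
standard_weights = {"20-44": 0.5, "45-64": 0.3, "65+": 0.2}

print(age_standardized_rate(cases, population, standard_weights))
```

Standardizing both the development and validation cohorts to the same
weights removes differences in incidence that are due only to differences
in age structure.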


Table 1: Baseline characteristics of development and validation
cohorts with DPoRT multivariate-adjusted hazard ratios and 95%
confidence intervals for 5-year physician-diagnosed diabetes for
males

Risk Factor           Subset                CI
Intercept                                   10.5861
Hypertension          No                    -
                      Yes                   -0.2413
Non-white Ethnicity   No                    -
                      Yes                   -0.6214
Heart Disease         No                    -
                      Yes                   -0.5344
Current Smoker        No                    -
                      Yes                   -0.1642
Education             < Post-secondary      -
                      Secondary             0.2222
Age-BMI category      BMI<23*Age<45         -
                      23<BMI<25*Age<45      -1.1364
                      25<BMI<30*Age<45      -1.4280
                      30<BMI<35*Age<45      -2.4324
                      BMI≥35*Age<45         -3.3704
                      BMI<23*Age≥45         -1.8463
                      23<BMI<25*Age≥45      -2.2225
                      25<BMI<30*Age≥45      -2.5466
                      30<BMI<35*Age≥45      -3.2163
                      BMI≥35*Age≥45         -3.4555
Scale                                       0.6026


Table 2: Baseline characteristics of development and validation
cohorts with DPoRT multivariate-adjusted hazard ratios and 95%
confidence intervals for 5-year physician-diagnosed diabetes for
females

Risk Factor           Subset                     CI
Intercept                                        10.3272
Hypertension          Yes                        -
                      No                         -0.2642
Non-white Ethnicity   Yes                        -
                      No                         -0.3207
Immigrant Status      < Post-secondary           -
                      Secondary                  0.2130
Education             < Post-secondary           -
                      Secondary                  0.2130
Age-BMI category      BMI<23*Age<45              -
                      23<BMI<25*Age<45           -0.4230
                      25<BMI<30*Age<45           -0.4361
                      30<BMI<35*Age<45           -1.2102
                      BMI≥35*Age<45              -2.0440
                      BMI = missing*Age<45       -1.1224
                      BMI<23*45<Age<65           0.0610
                      23<BMI<25*45<Age<65        -0.6021
                      25<BMI<30*45<Age<65        -1.2132
                      30<BMI<35*45<Age<65        -2.1140
                      BMI≥35*45<Age<65           -2.1665
                      BMI = missing*45<Age<65    -1.6240
                      BMI<23*Age≥65              -1.0413
                      23<BMI<25*Age≥65           -1.1215
                      25<BMI<30*Age≥65           -1.5765
                      30<BMI<35*Age≥65           -1.6142
                      BMI≥35*Age≥65              -1.6552
                      BMI = missing*Age≥65       -1.6552
Scale                                            0.6302
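
The Intercept and Scale rows in the tables above suggest a parametric
survival model. The sketch below shows how such coefficients could be
turned into an absolute risk if the model is assumed to be a Weibull
accelerated failure time model with follow-up time measured in days.
Neither assumption is stated in the text, so the function is illustrative
only and its output should not be read as a DPoRT estimate. The
coefficient values are copied from Table 1 (males); the dictionary keys
are hypothetical labels.

```python
import math

# Values copied from Table 1 (males); key names are illustrative labels.
MALE_INTERCEPT = 10.5861
MALE_SCALE = 0.6026
MALE_COEFFICIENTS = {
    "hypertension": -0.2413,
    "non_white_ethnicity": -0.6214,
    "heart_disease": -0.5344,
    "current_smoker": -0.1642,
    "bmi_30_to_35_age_under_45": -2.4324,
}

DAYS_PER_YEAR = 365.25  # ASSUMPTION: follow-up time is expressed in days


def weibull_aft_risk(present_factors, years,
                     intercept=MALE_INTERCEPT,
                     scale=MALE_SCALE,
                     coefficients=MALE_COEFFICIENTS):
    """Probability of the event within `years` under an ASSUMED Weibull
    accelerated failure time model:

        S(t) = exp(-exp((log(t) - mu) / scale)),  risk = 1 - S(t)

    where mu is the intercept plus the coefficients of the risk factors
    that are present.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    mu = intercept + sum(coefficients[name] for name in present_factors)
    z = (math.log(years * DAYS_PER_YEAR) - mu) / scale
    return 1.0 - math.exp(-math.exp(z))


reference_risk = weibull_aft_risk([], years=5)
hypertensive_risk = weibull_aft_risk(["hypertension"], years=5)
print(reference_risk, hypertensive_risk)
```

Under this parameterization a negative coefficient shortens the expected
time to diagnosis and therefore raises the predicted risk, which is
consistent with the negative signs attached to the risk factors in both
tables.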


DISCUSSION 


This study demonstrated that diabetes risk can be accurately
predicted at the population level using self-reported age, sex,
Body Mass Index (BMI) and other measures available in population
health surveys. In addition to displaying good discrimination,
DPoRT-predicted rates closely agreed with observed rates for both
males and females in both external validation cohorts, and this
agreement was generally maintained across deciles and quintiles
of risk. To our knowledge, DPoRT is the first validated risk tool
that is integrated into commonly collected population health
survey data. DPoRT offers advantages over existing methods used
to estimate future diabetes risk in populations. Previous studies
that estimate future diabetes burden have either extrapolated
overall trends in diabetes prevalence or indirectly incorporated
information on the influence of risk factors with various
assumptions. Other studies focus on overall diabetes burden, a
useful approach, but one that does not enable users to directly
assess the impact of risk factors, such as BMI, on future
diabetes. Furthermore, these studies did not assess how future
diabetes can be prevented by targeting risk factors, since they
do not directly quantify the influence of risk factors on
baseline risk or diabetes incidence.


Complex modelling and simulation studies differ from the approach
used in this study in that they use additional information on how
populations and risk factors change over time. Other simulation
studies add more detailed clinical information, such as fasting
blood sugar level or diabetes family history, data not available
at the population level. A strength of these simulation models is
that they can combine different data sources and study findings.
However, these models are complex and often represent clinical or
theoretical populations, making their estimates difficult to
validate in external populations that are meaningful for
population health planning. DPoRT could be incorporated into
simulation models that consider future changes in population
composition and risk factors. The nature of diabetes risk allowed
us to discriminate and explain risk using a limited number of
variables, most importantly BMI. Discrimination of DPoRT is as
high as or higher than that of many risk prediction tools used in
clinical practice. The algorithm was further calibrated using
population means, which may attenuate differences between
populations since risk estimates are relative to baseline risk in
the population.


Int J Cur Res Rev | Vol 12 • Issue 21 • November 2020


Kumar et al.: Risk prediction for diabetes mellitus - a population based approach


Given current data in most countries, DPoRT is a more balanced
approach to estimating diabetes risk than methods used in
previous research. Several important clinical values are excluded
from DPoRT, such as waist-to-hip ratio, waist circumference,
fasting blood glucose, and family history. Although these
variables may be clinically important for assessing diabetes
risk, adding these or other detailed anthropometric measures is
not feasible because they are not routinely collected in most
populations. These omitted variables are unlikely to have a major
impact on the performance characteristics of the model because of
the clustering of risk factors, particularly when dealing with
abnormalities of the metabolic system. Variables not included in
DPoRT, such as family history of diabetes or poor diet, are also
associated with the clustering of metabolic risk factors that are
included in the algorithm, such as hypertension and BMI. Obesity
is the most important factor in predicting diabetes risk. BMI is
the most commonly used marker of obesity; however, measures of
central obesity may capture the entire risk domain more
comprehensively and be more meaningful across all age groups. A
recent meta-analysis has shown that there is no evidence of a
difference in estimates associated with incident diabetes between
BMI, waist circumference and waist/hip ratio. Furthermore,
algorithms to identify individuals for weight loss in populations
did not differ whether BMI or waist circumference was used. To
ensure DPoRT can be applied in different populations, we gave
preference to variables that were based on established evidence,
remained stable over time, were unlikely to be subject to serious
measurement error (unlike, for example, alcohol and dietary
habits), and were easily captured using survey data in different
populations. For example, physical activity has been shown to
have a protective effect on diabetes incidence but was removed
from the final algorithm because it could not be captured in a
reliable and reproducible manner across studies, and because of
its marginal improvement in the discrimination of diabetes risk
in our derivation cohort. Despite placing considerable
constraints on variable selection as a means of ensuring maximum
feasibility, DPoRT maintained good discrimination. The effect of
self-reported BMI may depend on the population where the
algorithm is being applied, since reporting patterns have been
shown to vary across gender and socioeconomic status. The ability
of DPoRT's predictive estimates to agree with observed diabetes
risk in different populations will be reduced if systematic
errors associated with responses vary across populations or time.


If diabetes testing/screening increases over time, predicted
estimates could be lower than the observed estimates (under the
assumption that this would lead to increased case detection).
DPoRT is accurate in different populations for different periods;
however, DPoRT could be recalibrated to predict total diabetes
cases using revised information on screening/testing or using
estimates of the number of undiagnosed cases relative to
diagnosed cases in the population. Finally, the potential for
inaccuracy increases the further into the future the predictions
are made or when unforeseen changes occur; therefore, it is
recommended that predictions from DPoRT be updated frequently
using the most recent data, that predictive calculations be
limited to 5 years or less, and that the risk tool be validated
in the population where it is being applied.
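
One simple way to read the recalibration idea above is to scale the
predicted number of diagnosed cases by the ratio of undiagnosed to
diagnosed cases. The sketch below shows only that arithmetic; the function
name and the figures are hypothetical, and an actual recalibration could
take a different form.

```python
def total_cases(predicted_diagnosed, undiagnosed_per_diagnosed):
    """Scale predicted diagnosed cases to total (diagnosed + undiagnosed).

    undiagnosed_per_diagnosed is the estimated number of undiagnosed cases
    for every diagnosed case in the population (for example 0.25).
    """
    if predicted_diagnosed < 0 or undiagnosed_per_diagnosed < 0:
        raise ValueError("inputs must be non-negative")
    return predicted_diagnosed * (1.0 + undiagnosed_per_diagnosed)


# Made-up numbers: 1,000 predicted diagnosed cases and one undiagnosed
# case for every four diagnosed cases.
print(total_cases(1000, 0.25))
```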


CONCLUSION 


In building the risk tool for diabetes, it was demonstrated that
BMI (a relative measure of weight for height) overwhelmingly
influences the predictions for developing diabetes in the future.
For that reason, clarifying determinants of weight and weight
change is essential when developing strategies to prevent or
reduce the future diabetes burden. In monitoring trends over
time, researchers are often faced with the dilemma of separating
trends between individuals from trends within individuals.
Multilevel growth models allow us to model both aspects, which
strengthens the ability to model trends that vary between and
within individuals.
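
As a minimal sketch of what such a model looks like, a two-level linear
growth model for a repeated measure such as BMI can be written as below.
The notation is the conventional one for growth models and is an
assumption here, since the text does not specify a model: y_{ti} is the
measure for individual i at occasion t, the level-1 equation describes
change within an individual, and the level-2 equations describe how
starting level and rate of change vary between individuals.

```latex
\[
\begin{aligned}
\text{Level 1 (within individuals):}\quad
  & y_{ti} = \pi_{0i} + \pi_{1i}\,\mathrm{time}_{ti} + \varepsilon_{ti} \\
\text{Level 2 (between individuals):}\quad
  & \pi_{0i} = \beta_{00} + u_{0i} \\
  & \pi_{1i} = \beta_{10} + u_{1i}
\end{aligned}
\]
```

Here \beta_{00} and \beta_{10} are the average starting level and average
rate of change, u_{0i} and u_{1i} are individual deviations from those
averages, and \varepsilon_{ti} is the occasion-specific residual.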